Role of Enzyme and Active Site Conformational Dynamics in the Catalysis by α-Amylase Explored with QM/MM Molecular Dynamics

We assessed enzyme:substrate conformational dynamics and the rate-limiting glycosylation step of a human pancreatic α-amylase:maltopentose complex. Microsecond molecular dynamics simulations suggested that the distance of the catalytic Asp197 nucleophile to the anomeric carbon of the buried glucoside is responsible for most of the enzyme active site fluctuations and that both Asp197 and Asp300 interact the most with the buried glucoside unit. The buried glucoside binds either in a 4C1 chair or 2SO skew conformations, both of which can change to TS-like conformations characteristic of retaining glucosidases. Starting from four distinct enzyme:substrate complexes, umbrella sampling quantum mechanics/molecular mechanics simulations (converged within less than 1 kcal·mol–1 within a total simulation time of 1.6 ns) indicated that the reaction occurrs with a Gibbs barrier of 13.9 kcal·mol –1, in one asynchronous concerted step encompassing an acid–base reaction with Glu233 followed by a loose SN2-like nucleophilic substitution by the Asp197. The transition state is characterized by a 2H3 half-chair conformation of the buried glucoside that quickly changes to the E3 envelope conformation preceding the attack of the anomeric carbon by the Asp197 nucleophile. Thermodynamic analysis of the reaction supported that a water molecule tightly hydrogen bonded to the glycosidic oxygen of the substrate at the reactant state (∼1.6 Å) forms a short hydrogen bond with Glu233 at the transition state (∼1.7 Å) and lowers the Gibbs barrier in over 5 kcal·mol–1. The resulting Asp197-glycosyl was mostly found in the 4C1 conformation, although the more endergonic B3,O conformation was also observed. Altogether, the combination of short distances for the acid–base reaction with the Glu233 and for the nucleophilic attack by the Asp197 nucleophile and the availability of water within hydrogen bonding distance of the glycosidic oxygen provides a reliable criteria to identify reactive conformations of α-amylase complexes.


■ INTRODUCTION
Despite the significant progress on the study of enzyme:substrate interactions, the kinetics and thermodynamics of enzyme catalysis or the reaction mechanisms underlying the catalytic power of enzymes, 1−4 the intricacies of enzyme efficiency and catalysis remain an active field of research and discussion. In particular, the study of reaction mechanisms using computational methods has disclosed relevant molecular details, which potentially lead to structure−thermodynamics relations that can be rationalized under a rigorous physicochemical theoretical framework. 3−7 In fact, the computational study of the reaction mechanism of enzymes such as lysozyme or human carbonic anhydrase-I 8,9 led to one of the most accepted hypothesis for the origin of the catalytic power of enzymes among computational biochemists, proposed by Warshel and co-workers, which states that enzymes encompass a preorganized network of dipoles that interacts favorably with transition states and lowers the energy barriers of the catalyzed chemical reactions. 10,11 According to the proposal, the low rate of chemical reactions in homogeneous media is a result of the high energy that is required to reorient the dipoles of the medium to stabilize the transition state of the chemical reaction. 9,11 Alongside the important developments in the origin of the catalytic power of enzymes, and together with the increasing availability of computational resources and more efficient exascalable codes, 12,13 it has become clear that the computational simulation of enzyme systems, and their catalyzed reactions in particular, plays now a pivotal role in the rational design of enzymes as efficient biocatalysts for the future generation of sustainable catalysts. 14−18 Glycosidases are among the most proficient enzyme catalysts in nature, alongside enzymes such as phosphatases or phosphodiesterases. 19,20 Since proficient enzymes interact strongly with transition states from their catalyzed reactions, their activity can be enhanced upon rationalization of enzyme:substrate interactions along every stage of their catalytic cycle. In particular, glycosidases can be enhanced by a factor of up to 10 17 -fold, and they are important candidates for transition state analogue-based inhibitor design, 21,22 as well as appealing bioengineering targets. 23,24 The α-amylase is one of the most studied glycosidases, with important clinical and industrial applications known to date. In mammals, it is responsible for the cleavage of α (1−4) glycosidic bonds in the amylose component of starch, and consequently a primary partner in glucose production for energy acquisition. Consequently, it is also a major target for the development of new inhibitors to treat type-II diabetes. 25,26 In the microbial world, α-amylases have a broader range of substrate specificities, from amylopectins, cyclodextrins, or glycogen to the starch parent substrate, 27 which grants them wide applicability in industry, namely in sugar, textile, baking, paper, and brewing industries. 28 The reaction mechanism of α-amylase has been characterized for pancreatic α-amylase and the starch substrate, where it was shown to occur through a concomitant acid−base reaction and a nucleophilic substitution, involving the catalytic Glu233 and Asp197, to hold a glycosylated enzyme intermediate and release an aglycone; and finally a hydrolysis of the glycosylated enzyme by a water molecule, which is activated by Glu233. 29, 30 Even though the computational kinetic results could reproduce the activation energy, obtained from experimentally determined k cat 31 under the transition state theory framework (14.8 vs 14.4 kcal·mol −1 ), 32 to our knowledge, there are no studies exploring the reaction space of α-amylase in a proper thermodynamic ensemble.
Early work by Pinto et al., conducted in our group, using hybrid quantum mechanics/molecular mechanics (QM/MM) methods to study the reaction mechanism of human pancreatic α-amylase, identified the glycosylation step as the rate-limiting step of the reaction. 29 The involved transition state occurred in a dissociative manner, with a concerted late formation of the glycosidic bond with the nucleophilic Asp197 upon cleavage of the substrate's glycosidic bond, assisted by the Glu233 acid. These observations agreed with other studies in glycosidases. 33,34 In the same study, the high solvent accessibility at the enzyme's active site was pointed as a key aspect to achieve full catalysis: not only a water molecule was responsible for an increase in acidity of the Glu233 but also another water molecule was crucial to perform the final hydrolysis step and complete catalysis, restoring the enzyme's native organization.
More recently, an adiabatic mapping multi-PES QM/MM study by Santos-Martins et al., also conducted in our group, characterized the glycosylation step in 18 conformations of αamylase and further emphasized the role of the water molecule in the rate-limiting step, showing that a hydrogen bond from a water molecule to the catalytic Glu233 was decisive for favorable catalysis. 35 The combination of these results and a 100 ns conventional molecular dynamics (cMD), where a volatile hydrogen bond between a water from the bulk and the Glu233 acid was observed, led the authors to infer that catalytically competent enzyme:substrate conformations are in fact short-lived, 35 further supporting that specific instantaneous active site arrangements are responsible for triggering catalysis by the enzyme. 2,35,36 Another atomistic study of the glycosidic cleavage step by the psychrophilic α-amylase from Pseudoalteromonas haloplankti, led by Kosugi and Hayashi, also pointed out that the reaction was favored by the fast localized reorganization of the active site and by the slow reorganization of a loop adjacent to the binding site at the nanosecond-time scale. 37 On the other hand, the organization of the active site of α-amylase and the resulting distorted chair conformation of the substrate still reveals several charge-complementarity residues that support the dominating electrostatic preorganization paradigm, 3,10,32 namely the relative position of the catalytic Glu233 and Asp197 to the scissile glycosidic bond, the concomitant interaction of the nucleophilic Asp197 with hydroxyl groups of the retained substrate in a syn orientation to the nucleophilic attack, or the Asp300 anchoring the hydroxyl groups of the retained substrate in an anti orientation. 29,38 Objective of the Present Work. In this work, we aim to study the enzyme:substrate conformational dynamics through microsecond cMD simulations, and the rate-limiting chemical step through extended QM/MM MD. Hence, we elaborate a comprehensive discussion on the atomistic determinants of the catalysis by α-amylase.
From the analysis of the most relevant interactions previously identified (Figure 1), 29,35 we discuss: the implications of the active site organization in the enzyme:substrate complex for the substrate binding and for the thermodynamics of the glycosylation step. Ultimately, we present a focused discussion on the role of active site electrostatics and enzyme conformational dynamics in enzyme catalysis, taking α-amylase as a case study. ■ METHODS Parameterizing the Enzyme:Substrate Model. We picked starting structures based on the criteria of: minimal modeling from the X-ray structure of human pancreatic αamylase (PDB ID: 1CPU), 39 and the spread of activation energies determined by the authors of ref 35. A total of 4 models were built based on the coordinates from an X-ray initial structure (PDB ID: 1CPU) and the coordinates at 51.6, 68.7, and 85.7 ns from the 100 ns cMD simulation of ref 35 (here labeled as A, B, and C). The latter correspond to the highest, lowest and in-between activation energies for the glycosylation step of α-amylase as determined from previous work (28.6, 9.3, and 20.1 kcal·mol −1 , respectively). In such a way, we guarantee that we started from nonequivalent reactive enzyme:substrate complexes that led to very different activation energies, despite their structural similarities at the active site. All protonations were assigned as in ref 35 and all disulfide bonds were inspected. The ff99SB force-field was used for all amino acids of the α-amylase model and calcium and chloride counterions, the GLYCAM_06h force-field was used to describe the maltopentose substrate (5-GA), and the TIP3P model was used for water solvent molecules. The models were enclosed in a periodic rectangular box with 67,992 atoms. Details of minimization and equilibration are provided in the Supporting Information.
Microsecond Conventional MD. The Gromacs 2018.3 package 40 was used to run all calculations. Topology and coordinate files were prepared with the parmed python module, as available in AmberTools 18. 41 An initial annealing simulation was run for 100 ps, with constant N and V, up to a reference temperature of 310 K, followed by a 2 ns equilibration of the density of the solvent in an NPT ensemble. An integration time step of 2 fs was used, using the LINCS algorithm 42,43 to constrain all hydrogen covalent bonds, and a cutoff of 10 Å was used to define explicit particle−particle electrostatic and Lennard-Jones interactions. Electrostatic interactions beyond 10 Å were treated with the particle-mesh Ewald summation method, 44 using a cubic interpolation grid and a 1.2 Å Fourier-spacing. The pressure of the system was set to 1 bar, coupled with a 2 ps relaxation time isotropic Berendsen barostat, 45 and a reference temperature of 310 K was set for the solute and solvent, which were coupled independently with a 0.1 ps relaxation time velocity-rescaling thermostat. 46 After equilibration, an initial 100 ns cMD was run for each of the four starting models, in an NPT ensemble at 1 bar and 310 K, with the isotropic Parrinello−Rahman barostat and the velocity-rescaling thermostat. 46,47 In order to produce microsecond time scale simulations, up to a total of 2 μs, productions of 100 ns were iteratively carried out after a clustering analysis based on the root-mean-square deviation (RMSd) of the heavy atoms of the active site of α-amylase depicted in Figure 1, right). 35 Each 100 ns trajectory was spliced every 100 ps for a clustering analysis with the Gromos algorithm, 48 considering an RMSd cutoff of 1.2 Å from the centroid structure. Every centroid structure farthest from the starting structure, and with an occupancy above 10% out of the 100 ns cMD, was selected for a new round of 100 ns cMD. New velocities were generated from a 310 K Maxwelldistribution and a new equilibration was carried out for every cMD iteration. Whenever Gibbs energies were computed, the data (with size N) was binned along N 3 bins, and the Gibbs energy was computed according to the expression ( ) , where k i is the occupation of the bin i, and k 0 is the occupation of the most occupied bin. Umbrella Sampling QM/MM MD. All QM/MM calculations were carried out with the AMBER 18 package, with QM calculations performed externally by the Orca 4.1.2 software, and using the Born−Oppenheimer approximation. 49,50 The electrostatic embedding approach, including the MM RESP charges in the QM Hamiltonian, was used to address the interaction between the QM and MM layers, and the link atom approach was used to address the boundary between the QM and MM layers. A set of 70 atoms was defined to undergo QM calculations at the PBE/def2-SVP level (Figure 1, right): the methyl−carboxylic terminus of acidic Glu233, nucleophilic Asp197, and transition state stabilizer Asp300; the glucose monomers involved in the scissile glycosidic bond of the substrate; and a water molecule near the acidic oxygen of Glu233 and the oxygen of the scissile glycosidic bond (previously shown to be important for catalysis). The combination of PBE 51 and small basis sets has been widely used to study reaction mechanisms of glycosidases, 52−55 as well as to accurately describe the conformational diversity of several carbohydrate units (including the glucopyranose ring composing the 5-GA substrate), 56,57 and it was also observed to accurately reproduce geometries for the glycosylation reaction with moderately low computational effort. 58 The use of single split-valence polarization basis sets was also indicated to reduce overpolarization effects in QM/MM MD simulations. 59 Hence, we assessed the quality of the combination of PBE and the def2-SVP basis set, which compares with the 6-31G** basis set and was previously employed in QM/MM MD studies of biological and systems, 60,61 to describe the glycosylation step of α-amylase, using the ONIOM methodology within the electrostatic embedding approach, as implemented in Gaussian 16. 62 Our results indicated that PBE/def2-SVP:AMBER level of theory reproduced the glycosylation step of α-amylase, having showed results comparable to those by Santos-Martins et al. at a higher level of theory (refer to Tables S1−S3 and Figure S1 in Supporting Information for details).
The energy minimization protocol was carried out with 50 steps of steepest descent search followed by a conjugate gradient search up to 200 cycles of minimization, followed by a short 2 ps MD simulation in the NPT ensemble (1 bar and 310 K) in a periodic box. The Langevin thermostat with a collision frequency of 1 ps −1 , and the Monte Carlo barostat with a relaxation time of 2 ps were employed. Nonbonded interactions were treated explicitly within 9 Å of each particle; beyond that, Coulomb interactions were treated with the particle-mesh Ewald scheme and Lennard-Jones interactions were truncated. An integration step of 1 fs was used to integrate motion equations.
After minimization and equilibration QM/MM calculations, a short 2 ps steered MD 63 was performed for each system along an RC defined as the difference between the distance of the cleaved scissile bond (d glyc ) and the distance of the hydrogen of the acidic Glu233-OH to the oxygen atom of the scissile bond (d acid ), The RC was defined upon inspection of the nuclear vibrations with the highest amplitude from the imaginary frequencies corresponding to the transition states characterized in previous studies.
Umbrella sampling QM/MM MD simulations were performed for the four models along the defined RC. To generate the windows composing each umbrella sampling simulation, a harmonic potential with a force constant of 100− 200 kcal·mol −1 ·Å −2 was centered every 0.10 Å along the amplitude defined for the RC. Starting conformations for each window were chosen as the ones from the trajectory of the steered MD whose RC value was the closest to the center of the harmonic potential applied in each window, and ran for 10 ps (resulting in a total of 450−500 ps per QM/MM model). A detailed description is provided in Supporting Information (section Umbrella sampling QM/MM MD and Figure S2). The data for the four models was then concatenated (total of ∼1.6 ns) and processed with the weighted-histogram analysis method as implemented in the WHAM software, by Grossman et al. 64 During the QM/MM MD simulations, time-dependent block analysis was carried out in the direct and reverse time series to evaluate the statistical convergence of the resulting potential of mean force (PMF) plots for the glycosidic cleavage step, until a difference below 1 kcal·mol −1 was observed between two successive PMFs. Finally, the first 2 ps of each window were considered an equilibration period and discarded (refer to Figure S3 in the Supporting Information for details).
For subsequent analysis, the individual probability of each simulated conformation, p i , was then determined from the WHAM equation, where the denominator is calculated as the sum of the Boltzmann factor e F U k T ( )/ j ij bias B of the difference between the reference energy F j of window j, as retrieved from a prior WHAM analysis, and the bias potential U ij bias of conformation i in window j, for every of the N j conformations in each window. As for the analysis of the MM simulations, projections of the Gibbs energy along different coordinate data sets were calculated after binning the data (with size N) along N 3 bins, according to the expression ΔG = −k B T ln p, where p corresponds to the accumulated probability of every conformation occupying each bin.
■ RESULTS AND DISCUSSION Enzyme:Substrate Dynamics. We have conducted microsecond MM simulations to assess the organization of the active site of α-amylase, its domain dynamics, and the main interactions involved in substrate binding.
The α-amylase exhibits low flexibility when complexed with a maltopentose (5-GA) substrate, on the microsecond time scale, as can be verified by the small variations of the enzyme backbone (below 2.0 Å, Figure 2A) relative to the starting Xray conformation and the root-mean-square fluctuations for the heavy atoms of each residue (refer to Figure S4 in the Supporting Information). When clustering the simulation by the RMSd of the enzyme's backbone using a 1.6 Å cutoff, two most representative conformations were identified with occupancies around 85% and 14%. While the most populated cluster closely resembles the starting X-ray conformation, the latter conformation exhibits an alternative conformation of the L2 loop (here composed by residues Phe295 to Thr314, Figure  2B), which was previously identified to stabilize the glycosylation step in the psychrophilic α-amylase, by compressing the transition state. 37 After clustering the same trajectory using the backbone atoms of the L2 loop using a 2.4 Å cutoff, we again observe that the two most representative clusters show similar occupation (83% and 14%), which suggests that, within the microsecond time scale, the L2 loop is the most flexible structural element of the enzyme:substrate complex.
The analysis of the RMSd of the catalytic residues at the enzyme's active site suggests that the binding of 5-GA to the active site of the enzyme may not be strongly promoted by every catalytic residue in the active site of the enzyme (the RMSd of the catalytic residues and glucoside units subject to cleavage in the active site varies significantly more than that of the backbone, Figure 2A). In fact, binding of the maltopentose substrate (5-GA) to the enzyme occurs mostly through its buried glucoside units, which closely interact with the catalytic Asp300 and residues Trp59 and Gln63, for over 90% of the simulation, whereas the close interactions with the leaving glucoside units are mostly maintained through less than 70% of the simulations (residues Lys200, Tyr151, Glu233, His201, and Leu235 in Figure 3).
In line with previous observations on retaining α-glycoside hydrolases, 53 such as α-amylase, the Cremer−Pople angle representation of the glucoside that will be subject to glycosylation ( Figure S5) indicates that the glucoside is kept in the chair conformation 4 C 1 for most of the simulation, which is the preactivated conformation for catalysis, but it can also change to the skew conformation 2 S O (ΔG around +2.7 kcal· mol −1 ). The latter conformation was suggested to favor TS-like conformations with a planar glucoside ring and a strong oxocarbenium-like character. 65,66 To better understand the interactions between the 5-GA substrate and the catalytic residues of α-amylase, we analyzed the distances describing the deprotonation of Glu233, the nucleophilic attack of Asp197 and the hydrogen bonding interactions with Asp300 (d acid , d nuc , and d hb(Asp300) in Figure  1). 29 The analysis of the relative orientation of Glu233 and Asp197 toward the 5-GA substrate is depicted in Figure 4A. Both these distances span beyond 10 Å in an uncorrelated fashion, describing two clear minima. Nevertheless, the most occupied minimum covers above 20% of the conformational space and it is located on a region where distance metrics suggest that the nucleophilic attack of Asp197 can occur concertedly with the deprotonation of Glu233 (d nuc within 4.7−5.5 Å and d acid within 4.3−5.0 Å), supporting the concerted reaction hypothesis. 29,37 Upon comparison of these distances along the two clusters, short distances are available in both, although shorter d nuc are more frequent in the cluster that most resembles the X-ray conformation ( Figure  S6).
Our analysis suggests that the nucleophilic distance from Asp197 to the glycosidic oxygen atom of the substrate (d nuc ) is the most correlated with the RMSd of the active site, which is apparent from the d nuc vs RMSD active site plot in Figure 4E and the resemblance of Figure 4, parts A and D, whereas the tight interaction of the Asp300-carboxylate group with the 2,3-OH groups of the buried glucoside unit (d hb(Asp300) ) is conserved regardless of the RMSd of the active site ( Figure 4F), contributing actively to the binding of the 5-GA.
The distance of the Glu233-OH to the glycosidic oxygen atom of the substrate (d acid ) varies along the cMD trajectory, as expected by the low accessibility to the glycosidic oxygen provided by the bulk of its bridging glycosidic rings and by its low polarity, which makes it a poor hydrogen bond acceptor. Nevertheless, shorter distances are mostly observed when Glu233 accepts a hydrogen bond from the 3-OH group of the leaving glucoside unit ( Figure 4B). The observation supports that the establishment of this hydrogen bond can increase the probability of Glu233 deprotonation by the glycosidic oxygen. The same does not seem to occur for the Asp197 nucleophile, where d nuc varies independently from the hydrogen bond between the Asp197 and the 6-OH of the buried glucoside unit ( Figure 4C), suggesting that Glu233 may play an important role in proper binding of the leaving glucoside unit, while Asp300 is the most responsible for the binding of the buried glucoside unit.
The Rate-Limiting Glycosylation Step of α-Amylase. Figure 5 (on top) depicts representative stationary points of the glycosylation step retrieved from the accumulated 1.6 ns equilibrated QM/MM MD simulations, upon a clustering analysis of the total simulation with the k-means algorithm.
A thermodynamic analysis of the process, as depicted in Figure 5, shows that the glycosylation step of Asp197 in αamylase occurs in one step with a Gibbs barrier around 13.9 kcal·mol −1 , and it involves a negligible change in the Gibbs energy of the system upon the formation of the glycosylated Figure 3. Left, enzyme residues with the most contacts with the 5-GA substrate (>70%) during the 2 μs MD simulations: residues within 3.5 Å interact closely with the substrate and contribute the most for substrate binding, and residues within 6 Å compose the binding pocket of α-amylase. Right, depiction of the binding site of the 5-GA substrate in white surface representation and of the close contacts with the 5-GA substrate (carbons colored in dark gray and light gray, respectively, for residues establishing contacts for longer than 70% and 50% of the MD simulations).
Asp197. Although the Gibbs barrier is in very good agreement with experimental and computational determinations, the Gibbs reaction energy differs from that determined in other computational studies, where it was determined to be exergonic. 29,37 Upon projection of the Gibbs energy along the d acid and d glyc distances composing the tentative reaction coordinate of the glycosylation step (RC), as depicted in Figure 5 (bottom right), we confirm that, in line with previous work, 29,35 the acid−base reaction by Glu233 and the release of an aglycone upon the nucleophilic attack of Asp197 occur in a concerted asynchronous manner, with the acid−base reaction preceding the nucleophilic attack by Asp197. A closer analysis at the distances along the representative stationary points in Figure 5 indicates that the most significant changes occur from the reactant to the transition state, with an emphasis on the hydrogen bonds formed between the buried water and both the glycosidic oxygen and the Glu233-OH, as well as those formed with the buried glucoside (between the Asp197carboxylate and the 6-OH, and the Asp300-carboxylate and 2,3-OH).
The transition state region is characterized by a broad d glyc range (from 1.8 to 2.3 Å), and a narrow d acid range corresponding to a protonated aglycone, which emphasizes both the loose character of the transition state of the S N 2 substitution and the asynchronous nature of the glycosylation step. A closer analysis of d nuc and the breaking glycosidic bond, d glyc , for each independent starting conformation, suggests that more dissociative transition states result in higher Gibbs barriers for the process ( Figure S7, X-ray and conformation A).
An analysis of the Cremer−Pople polar coordinates (φ and θ) also shows the effects of ring puckering in the α-Dglucopyranose that is glycosylated by the Asp197. While retaining α-glucosidases, which include the α-amylase here studied, were proposed to follow a 4 C 1 → [ 4 H 3 ] ‡ → 1 S 3 pathway during the glycosylation step, 67 a recent study by Alonso-Gil et al. observed that in the amylosucrase of Neisseria polysaccharea, which also belongs to the same family of αamylase, the α-D-glucopyranose of the sucrose substrate was kept in the 4 C 1 chair conformation both at the reactant and glycosylated stages. 53 These results suggest that not all enzymes in the same family may follow similar ring puckering pathways in the glycosylation step. Consequently, we performed a similar analysis for the glycosylation step of αamylase, which we summarize in Figure 6.
In line with the results of Alonso-Gil, the 4 C 1 conformation is also prevalent at the reactant state and for the glycosylated intermediate. In addition, the reactant state may interchange between the 4 C 1 and 2 S O conformations, in line with the results of our cMD simulations. The TS of the glycosylation step then comprises a change in the glucoside ring to the half-chair 2 H 3 conformation, which rapidly converts into the envelope E 3 conformation where the anomeric carbon is most electrophilic, before holding the prevalent 4 C 1 conformation once the nucleophilic attack by the Asp197 is concluded. As previously observed by Biarneś et al. in the Bacillus 1,3−1,4-β-glucanase, 52 we observe that the TS does not correspond to the conformation where the anomeric is the most electrophilic. Interestingly, after performing the same analysis for each of the four starting conformations simulated ( Figure S8), we observe that those in which the 2 S O conformation is dominant led to more endergonic Gibbs reaction energies for the glycosylation step, which were characterized by products both in the B 3,O boat conformation and the prevalent 4 C 1 chain conformation and to TSs farther from the half-chair/envelope TS-like conformations. Previous electronic structure calculations carried out in isolated pyranose rings by Mayes et al. indicated that the interchange between the 4 C 1 and 2 S O conformations is possible, although the latter should be short-lived and fall either to the 4 C 1 or B 3,O conformations. 68 Hence, our higher Gibbs reaction energy is likely the result of the relative abundance of the different conformations of the Asp197glycosylated product, which could not be reproduced with the

Journal of Chemical Information and Modeling
pubs.acs.org/jcim Article conducted adiabatic mapping calculations unless multiple independent QM/MM calculations are performed. 35,69 Hydrogen Bonds during the Glycosylation Step. Regarding Glu233 and Asp197, key for the glycosylation step, a structural analysis suggests that the reaction is favored by the conserved hydrogen bond formed between the Asp197 and the 6-OH of the buried glycosidic unit ( Figure 7D), while the hydrogen bond between 3-OH of the leaving glucoside and the Glu233 acid is less prevalent up to the transition state, when the building negative charge in Glu233 strongly favors the hydrogen bond with the leaving glucoside unit ( Figure 7B).
Together with the Asp197, the Asp300 establishes the more frequent short hydrogen bonds with the buried glucoside ( Figure 7C) and keeps the buried glucoside in place for catalysis. These hydrogen bonds are also significant for transition state stabilization, as longer hydrogen bonds correspond to higher Gibbs energies nearby the transition state region (refer also to Figure S9 for an analysis by starting conformation). The Asp300 also lowers the Gibbs reaction energy, as supported by the estimated energy difference of ∼3 kcal·mol −1 required to weaken hydrogen bonds with Asp300 at the product state (see PROD in Figure 7C), and the higher reaction energies observed for the conformations A and C (refer also to Figure S3 in Supporting Information). The fact that this interaction is less prevalent in the product state than in the reactant state may also explain why the glycosylation step involves a negligible Gibbs energy change, even though the release of the resulting aglycone should then be entropically favored, resulting in an overall exergonic process.
The analysis of the proximity of the buried water to the glycosidic oxygen and the Glu233 acid, in Figure 7A (refer also to Figures S8 and S9) indicates that the availability of water solvent nearby the glycosidic oxygen is favored in the reactant state, regardless of the starting conformation. On the other hand, the hydrogen bond to the Glu233 acid is weak in the reactant state, as it varies from 2.5 to 5.0 Å at a cost within less than 2 kcal·mol −1 , and the interaction becomes more favorable as the nucleophilic attack by Asp197 is performed, which indicates that the water molecule may stabilize the building negative charge of Glu233 upon the formation of the aglycone.
Molecular Determinants for the Catalytic Rate. In order to outline the molecular determinants promoting a high or low catalytic rate of α-amylase, we compared the same molecular descriptors for the least favorable reaction (conformation A), whose Gibbs barriers and reaction energies varied from ∼25 to 16 kcal·mol −1 , against the final equilibrated QM/MM MD simulations.
The most significant difference occurs for the acid−base reaction involving the Glu233, which is most favored when d glyc stretches farther than 2 Å. At this stage, the deprotonation of the Glu233 is favored in over 5 kcal·mol −1 , as shown in Figure 8. Comparing the accumulated first and last 4 ps of the QM/MM MD simulations starting at 51.6 ns, the deprotonation of the Glu233 at the transition state is less frequent for the accumulated data of the initial 4 ps (Figure 8, top). These lower rates of Glu233 deprotonation occur for distances between the nearby buried water and the Glu233 acid (d Glu233−WAT ) above 3.5 Å (Figure 8, bottom), which is also indicated by Figure S10 to result in an increase of the Gibbs barrier by over 5 kcal·mol −1 .
Previous results from multi-PES QM/MM ONIOM calculations suggested that a water nearby the acidic Glu233 modulates the catalytic rate of α-amylase. 35 The water would decrease the pK a of Glu233 through hydrogen-bonding to the carboxylic acid group and thus promote its deprotonation by the glycosidic oxygen. The authors indicated that lower activation energies should be expected for lower sums of d acid + d nuc + d Glu233 at the reactant state, so-called d actsite , which means that more productive conformations for the glycosylation step would require for: the Glu233 acid to be close to the glycosidic oxygen (short d acid ), the Asp197 nucleophile to be close to the anomeric carbon of the buried glucoside unit (short d nuc ), and a water molecule to be within hydrogen bonding distance of the Glu233 acid (short d Glu233−WAT ). 35 The same representation (left-hand side of Figure 9) indicates that two minima are available for the reactant state (at d actsite ≈ 9 and 11 Å), within a barrier below 3 kcal·mol −1 and with a Gibbs energy difference of ca. 1 kcal·mol −1 . Since both d acid and d nuc vary monotonically during the glycosylation step, this result indicates that a direct hydrogen bond between the buried water and Glu233 is only slightly thermodynamically favored, and it might not be a requisite for productive conformations. Instead, the availability of a water molecule within the first and second solvation spheres of Glu233 (between 3 and 6 Å) is required to promote the formation of a short hydrogen bond between the water and Glu233 at the Our results suggest that the availability of water within hydrogen bonding distance of the glycosidic oxygen, represented by d glyc−WAT , is a better criterion to characterize productive conformations, together with d acid and d nuc . In fact, the short interaction between the glycosidic oxygen and the water is thermodynamically more favorable than that between the water and the Glu233, as was confirmed in Figure 7A, and does not impair the hydrogen bond formed with the Glu233 toward the transition state. Hence, during the glycosylation step, the hydrogen bond formed between the water and the glycosidic oxygen (d glyc−WAT ) increases as the Glu233 acid becomes closer to the glycosidic oxygen (d acid ) to carry out the acid−base reaction leading to the formation of the transient oxocarbenium glucoside, and the Asp197 nucleophile attacks the anomeric carbon of the resulting oxocarbenium glucoside (d nuc ) leading to the glycosylated Asp197. By combining these three distances, [d acid − d glyc-WAT ] + d nuc , we obtain a monotonic variable with the three clear stationary points characteristic of the glycosylation step (refer to the right-hand side of Figure 9). As a result, we believe that this combination might be a better alternative to quantify productive conformations to carry out the rate-limiting glycosylation step of α-amylase.
We also inspected the behavior of the L2 loop throughout the glycosylation step to assess how it could contribute for catalysis. However, our clustering analysis over the RMSd backbone atoms of the residues comprising the loop indicated that the representative structures of each cluster are similar (0.6 Å) when compared to those from the MM MD simulations (2.4 Å). Since each of our independent QM/ MM MD simulations are well below the nanosecond time scale, we cannot infer the contribution of the conformation of the L2 loop for the glycosylation step, although we observe that the glycosylation step could still be carried out efficiently when no appreciable changes were observed in its overall conformation.
Ultimately, we observe that the catalytic rate of α-amylase is ensured by the tight binding of the buried glucoside unit by strong hydrogen bonds from its 2,3,6-OH groups to Asp300 and the nucleophilic Asp197 itself. Then, because of the extended conformation in which α-amylase binds the 5-GA substrate, the glycosidic oxygen bridging the buried and leaving glucosides exhibits a strong ability to accept hydrogen bonds from waters in the bulk solvent, which will lower the pK a of the Glu233 acid upon proximity and favor its deprotonation to stabilize the oxyanion resulting from the nucleophilic attack of Asp197 at the anomeric carbon.

■ CONCLUSIONS
We elaborate on a comprehensive discussion on the atomistic determinants of enzyme catalysis by human pancreatic αamylase, with quantitative and qualitative detail, by combining extensive conventional and QM/MM MD simulations for a set of distinct enzyme:substrate conformations.
Microsecond MD simulations showed that the α-amylase exhibits a very rigid backbone dynamics, as opposed to that of its active site. In particular, the enzyme was observed to interact more closely with the buried part of the substrate, where the nucleophilic attack by Asp197 occurs, through contacts below 3.5 Å with the catalytic Asp197 and Asp300 and the Trp59, Gln63, Thr163, and Lys200 residues. Despite the tight binding or the buried substrate for over 70% of the simulated time, favorable catalytic distances for the Glu233 acid and the Asp197 (below 5 Å) were only observed for about 20% of the simulated time. The catalytic distance describing the nucleophilic attack of Asp197 to the anomeric carbon of the buried glucoside was observed to be the most affected by changes in the active site, which suggests that the glycosylation step can be regulated by selectively tuning the binding of the buried part of the substrate.
Umbrella sampling QM/MM MD simulations amounting to the nanosecond time scale, at the DFT level, resulted in a kinetic and thermodynamic characterization of the ratelimiting glycosylation step in very good agreement with experimental and computational works published to date. Our calculations confirmed the loose and asynchronous nature of the glycosylation step in α-amylase, which consists of a onestep S N 2-like nucleophilic substitution of the Asp197 on the anomeric carbon of the buried glucoside of the substrate, preceded by the protonation of the glycosidic oxygen by the Glu233 acid to release the aglycone molecule. In line with the results from other α-glucosidases, the buried glucoside of the substrate binds more favorably in a 4 C 1 chair conformation that changes to the half-chair 2 H 3 conformation at the transition state, rapidly converting to the envelope E 3 conformation where the anomeric carbon is most electrophilic. The product state is again characterized by a 4 C 1 chair conformation. An alternative 2 S O skew conformation leads to transition states farther from the half-chair/envelope TS-like conformations and to an Asp197-glucosyl product that can interchange between the 4 C 1 chair and the more endergonic B 3,O boat conformation.
An analysis of the main interactions involved in the glycosylation step clarified that the glycosidic oxygen of the leaving glucoside unit is particularly responsible for the retention of a bulk water molecule, which promotes enzyme catalysis by lowering the pK a of the Glu233 acid. This buried water was confirmed to form a short hydrogen bond with the Glu233 at the transition state, and consequently promote the acid−base reaction with the glycosidic oxygen. The latter reaction is key to lower the Gibbs barrier of the nucleophilic attack of the Asp197 by over 5 kcal·mol −1 . We observed that the buried water interacts more pronouncedly with the glycosidic oxygen of the substrate, and we finally propose a simple collective variable consisting of a linear combination of distances of the Glu233 and the buried water to the glycosidic oxygen and the distance corresponding to the nucleophilic attack of Asp197 to the anomeric carbon of the buried glucoside, which can be used to identify reactive conformations of enzyme:substrate complexes for α-amylase.
These results are an important contribution for the accurate characterization of enzyme catalysis and dynamics, and they should contribute to improved rational drug design and enzyme engineering campaigns.

■ DATA AND SOFTWARE AVAILABILITY
Topology and coordinate files for the solvated α-amylase:5-GA complex are available in the Supporting Information in machine-readable format for Gromacs 2018.3 (SI_gmx-md_inputs.zip) and AMBER18 (SI_amber-md_inputs.zip). Conventional molecular mechanics MD simulations were performed with the Gromacs 2018.3 software, freely available for download at https://manual.gromacs.org/documentation/ 2018.3. Required files to reproduce the simulations are available in the Supporting Information (SI_gmx-md_inputs.zip). Data collection and analysis were performed with the trjconv, rms, mindist, distance and cluster modules of Gromacs 2018.3, and the collected data is available in the Supporting Information (rawdata_md.txt). Cremer−Pople puckering angles were calculated from the cp.py python script made available by Hill and Reilly (DOI: 10.1021/ci600492e) and are available in the Supporting Information (cremer-pople_md.txt and rawdata_us-qmmm.txt). QM/MM simulations, as well as DFT calculations in cluster models, were performed with the Orca 4.2.1 software, available free of charge for academic purposes, upon registration, at https://orcaforum.kofo.mpg.de, and the sander module of the AMBER 18 package, also available free of charge in AmberTools 18. The AMBER package is a licensed software that can be purchased via the https://ambermd.org Web site for both academic and industrial use. The AmberTools package is available free of charge, upon registration, at https://ambermd.org/GetAmber. php#ambertools. Required files to reproduce the QM/MM simulations are available in the Supporting Information (SI_orca-amber-qmmmm-md_inputs.zip). Data collection and analysis of the QM/MM simulations was performed with the cpptraj module available in the AMBER 18 and AmberTools 18 packages, and the collected data is available in the Supporting Information (rawdata_us-qmmm.txt). Free energy analysis of umbrella sampling simulations was run with the WHAM 2.0.11 software, freely available for download at http://membrane.urmc.rochester.edu. In-house python scripts were developed using Python 2.7.18, available free of charge at https://www.python.org/downloads/release/ python-2718, and the python modules pandas 0.24.2 (https://pypi.org/project/pandas/0.24.2), numpy 1.14.0 (https://pypi.org/project/numpy/1.14.0), scipy 1.2.3 (https://docs.scipy.org/doc/scipy/release.1.2.3.html), matplotlib 2.2.5 (https://matplotlib.org/2.2.3/contents.html). The ONIOM calculations to confirm the choice of the PBE/def2-SVP level of theory for carrying out the QM/MM simulations were conducted in the Gaussian 16 software, which is a licensed software that is available for purchase at https:// gaussian.com/pricing. Gaussian input files of the optimized reactant and transition state stationary points are provided in Supporting Information (SI_oniom-qmmm_inputs.zip).
Details of the Methods section, validation of the choice of the PBE/def2-SVP level of theory to conduct the QM/MM simulations ( Figure S1 and Tables S1−S3), density plots of RC and Gibbs energy profiles along the umbrella sampling QM/MM MD simulations ( Figures  S2 and S3), and auxiliary plots concerning enzyme substrate dynamics (Figures S4−S6)